Alternative sampling for variational quantum Monte Carlo 
Expectation values of physical quantities may accurately be obtained by the evaluation of integrals
within Many-Body Quantum mechanics, and these multi-dimensional integrals may be estimated 
using Monte Carlo methods. In a previous publication it has been shown that for the simplest, 
most commonly applied strategy in continuum Quantum Monte Carlo, the random error in the 
resulting estimates is not well controlled. At best the Central Limit theorem is valid in its weakest 
form, and at worst it is invalid and replaced by an alternative Generalised Central Limit theorem 
and non-Normal random error. In both cases the random error is not controlled. Here we consider 
a new 'residual sampling strategy' that reintroduces the Central Limit Theorem in its strongest 
form, and provides full control of the random error in estimates. Estimates of the total energy 
and the variance of the local energy within Variational Monte Carlo are considered in detail, and 
the approach presented may be generalised to expectation values of other operators, and to other 
variants of the Quantum Monte Carlo method. 

PACS numbers: 02.70.Ss, 02.70.Tt, 31.25.-v



A primary problem in solving for the ground states of
many-body quantum systems is the evaluation of $3N$-dimensional
integrals, where $N$ is the number of particles
interacting in 3-dimensional space. This paper considers
estimates of expectation values for combinations of a
many-body trial wavefunction and an operator, with particular
emphasis on those used for the optimisation of
a trial wavefunction via a parameterised freedom within
that wavefunction. Monte Carlo (MC) methods provide
a powerful numerical tool for evaluating these integrals 
by expressing the exact integral as an expectation value. 
By constructing a sample estimate of this expectation 
value, such problems can be made tractable. 

The resulting estimate is a sample taken from a ran- 
dom distribution, so some knowledge of this distribution 
and its relationship with the underlying 'true' value must 
be available for it to be useful. Past work in Quantum 
Monte Carlo has taken this distribution to be Normal, 
usually justified by expressing the estimates as sums of 
random variables and assuming the validity of the Cen- 
tral Limit Theorem (CLT). It has recently been shown 
that for the usual implementation of QMC (referred to 
as 'standard sampling') this is only partly true for es- 
timates of the total energy, and completely untrue for 
estimates of the (residual) variance of the local energy. 
These two quantities are the most prominent estimated 
quantities in Variational Monte Carlo (VMC). For the 
first of these the deviation of random errors from Nor- 
mal may be significant for a finite number of samples in 
the VMC calculation, with outliers occurring. For the 
second of these the random errors are not Normal even
in the large sample size limit, and large outlier errors are
orders of magnitude more likely than the CLT suggests.
Such non-Normal distributions of errors are a direct consequence
of the presence of singularities in the sampled
quantities at the nodal surface. These singularities may 
not easily be prevented, and have been found to result 
in the failure of the CLT for estimates of many physical 
expectation values sought using QMC methods. 

In what follows a new sampling strategy, referred to 
as 'residual sampling', is developed that reintroduces the 
CLT in its strongest form for estimates of the total energy
and (residual) variance of the local energy. The paper
consists of 6 sections. In Section I the new sampling
strategy is described. Sections II and III describe the
construction of estimates of the total energy and residual
variance within this sampling strategy, and derive
the distribution of random errors and confidence intervals
for the estimates. Section IV considers the general
conditions that a sampling strategy must satisfy in order
for the CLT to hold for a given estimated quantity,
so justifying the choice of sampling strategy. Analytical
results, or numerical results for the example case of an
isolated all-electron carbon atom, are presented in each
section as appropriate. Section V considers how an
estimate/sampling strategy combination may be chosen such
that the CLT is valid for an estimate of a physical quantity
of interest, and the example of the electronic kinetic
energy is considered. Section VI concludes the paper.

Before commencing we note that this paper is the second
of two closely related papers. The preceding paper,
[1], develops the statistical description of the random
error inherent in QMC, and derives the deficiencies
of the standard sampling method. In the current paper, 
new sampling strategies are developed, together with an 
analysis of the accompanying random errors in estimates. 
This provides a method for avoiding the deficiencies of 
standard sampling by controlling the random error and 
introducing a valid CLT for an estimate of interest. 
I. GENERAL SAMPLING IN VMC, AND A
NEW SAMPLING STRATEGY 

Generally, VMC involves generating statistical estimates
of the expectation value of an operator, $\hat{g}$, of the
form

$$G = \frac{\int \psi \hat{g} \psi \, d\mathbf{R}}{\int \psi^2 \, d\mathbf{R}}. \qquad (1)$$

Expressing this in terms of the statistical expectation of
a function $G_L = \psi^{-1}\hat{g}\psi$ sampled over a random
distribution of $3N$-dimensional vectors, $\mathbf{R}$, with
Probability Density Function (PDF) $P(\mathbf{R})$, gives

$$G = \frac{E[G_L\psi^2/P; P]}{E[\psi^2/P; P]}, \qquad (2)$$

where $E[x; P] = \int x P \, d\mathbf{R}$ is the definition of the
expectation. The function $G_L(\mathbf{R})$ is the 'local value' of the
operator/trial wavefunction combination. This is true
for any general distribution, $P$.

This can also be expressed as statistical estimates constructed
from samples taken from $P$. Introducing the
notation $A_r[f]$ for an estimate of $f$ constructed using $r$
samples gives

$$A_r[G] = \frac{E[G_L\psi^2/P; P] + \mathsf{Y}_r}{E[\psi^2/P; P] + \mathsf{X}_r} = G + \mathsf{W}_r, \qquad (3)$$

where $\mathsf{W}_r$, $\mathsf{Y}_r$, and $\mathsf{X}_r$ are random error variables. The
random variable $\mathsf{W}_r$ is not Normal, but $\mathsf{Y}_r$ and $\mathsf{X}_r$ may
be, and may be correlated to some degree.

The 'standard sampling' solution is to choose $P = \lambda\psi^2$,
with $\lambda$ an unknown normalisation constant, so that

$$A_r[G] = \frac{1}{r}\sum_{n=1}^{r} G_L(\mathbf{R}_n) = G + \mathsf{Y}_r \qquad (4)$$

and $\mathsf{X}_r = 0$. As has previously been shown [1], singularities
in $G_L$ can easily prevent the distribution of $\mathsf{Y}_r$
from being Normal by invalidating the CLT. Although
standard sampling provides the simplest analytic form
for a MC estimate, there is nothing to suggest that it is
optimum for controlling the statistical error in $A_r[G]$.

Returning to general sampling complicates the analysis,
but provides a means of influencing the random error
present in estimated quantities, since the distribution of
the random error, $\mathsf{W}_r$, is influenced by the choice of
sampling distribution, $P$.

Writing the general sampling distribution as

$$P = \lambda\psi^2/w, \qquad (5)$$

where $w$ is a weight function and $\lambda$ is an unknown
normalisation factor, provides the estimate of $G$ in the
more concise form

$$A_r[G] = \frac{E[wG_L; P] + \mathsf{Y}_r}{E[w; P] + \mathsf{X}_r}. \qquad (6)$$

In order to control the statistics of estimates of the
total energy and (residual) variance, we begin by introducing
the local energy, $E_L = \psi^{-1}\hat{H}\psi$, defined in terms
of the Hamiltonian operator, $\hat{H}$. We then limit ourselves
to weights that are functions of the local energy, $w(E_L)$,
and to operators of the form $\hat{g} = f(\hat{H})$. Expectation values
of this operator, $F$, then have MC estimates given
by

$$A_r[F] = \frac{\sum_{n=1}^{r} w(\mathsf{E}_n) f(\mathsf{E}_n)}{\sum_{n=1}^{r} w(\mathsf{E}_n)}, \qquad P(E) = P_\epsilon(E), \qquad (7)$$

where $\mathsf{E}_n$ is the $n$th independent identically distributed
(IID) random variable defined as the sample local energy
at $\mathbf{R}_n$, and distributed as

$$P_\epsilon(E) = \frac{\lambda}{w(E)} \int_{E_L = E} \frac{\psi^2}{|\nabla_{\mathbf{R}} E_L|} \, d^{3N-1}\mathbf{R} = \frac{\lambda'}{w(E)} P_{\psi^2}(E), \qquad (8)$$

where $\lambda'$ is a further unknown normalisation constant,
and the integral is taken over a $(3N-1)$-dimensional surface
of constant local energy. In the last expression, $P_{\psi^2}(E)$ is
the distribution of local energies that occurs for standard
sampling.

Note that $w(E) = 1$ results in standard sampling, with
the $E^{-4}$ leptokurtic tails for $P_{\psi^2}(E)$, and the resulting
CLT issues for VMC. The essential feature of this approach
is that different choices of weight function, $w(E)$,
provide different estimators for $F$, with different
distributions of random error in the estimates.

'Residual sampling' is defined by choosing the weight
function to take the particular form

$$w(E) = \frac{\epsilon^2}{(E - E_0)^2 + \epsilon^2}, \qquad (9)$$



where $(E_0, \epsilon)$ are parameters that influence the random
error in the estimate. Equation (9) may be interpreted
as interpolating between a perfect sampling of the numerator
and of the denominator of an estimate of the residual
variance. This weight function ensures that, provided
$f(E)$ increases quadratically or slower in the limit of $E$
approaching infinity from above or below, the sampled
quantities will be bounded from above and below even in
the presence of singularities in the local energy. It is the
introduction of this bound on the sample values that
results in the reintroduction of the CLT, as described in
the next section. A further significant difference between 
standard and residual sampling is that the former does 
not sample in the region of the nodal surface, whereas 
the latter does. 
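
As a minimal numerical sketch of this boundedness (an illustration only, not part of the calculations reported here), the Python fragment below assumes the Lorentzian weight $w(E) = \epsilon^2/((E - E_0)^2 + \epsilon^2)$ and evaluates the sampled quantities $w$, $wE$ and $wE^2$ for local energies spanning many orders of magnitude. The function names and the parameter values `E0` and `EPS` are hypothetical. The bound $wE^2 \le E_0^2 + \epsilon^2$ used in the check follows from maximising $w(E)E^2$ over $E$ for this particular weight.

```python
import math


def residual_weight(energy, e0, eps):
    """Assumed weight: eps^2 / ((E - E0)^2 + eps^2), which lies in (0, 1]."""
    return eps * eps / ((energy - e0) ** 2 + eps * eps)


def sampled_values(energy, e0, eps):
    """Return (w, w*E, w*E^2), the sampled quantities for f(E) = 1, E, E^2."""
    w = residual_weight(energy, e0, eps)
    return w, w * energy, w * energy * energy


# Hypothetical parameters, chosen only for illustration.
E0, EPS = -37.8, 0.7

# Local energies of both signs over many orders of magnitude, mimicking
# the divergent values met close to the nodal surface.
ENERGIES = [sign * 10.0 ** k for k in range(-2, 9) for sign in (-1.0, 1.0)]

# Analytic supremum of w(E) * E^2 for this weight.
BOUND = E0 * E0 + EPS * EPS

for energy in ENERGIES:
    w, we, wee = sampled_values(energy, E0, EPS)
    print(f"E = {energy: .1e}  w = {w:.3e}  wE = {we: .3e}  wE^2 = {wee:.3e}")
```

However large $|E|$ becomes, the printed values of $wE^2$ stay below `BOUND`, whereas the unweighted $E^2$ sampled in standard sampling is unbounded.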

From this point on, $w(E)$ refers to Eq. (9), and the
accompanying distribution of samples in multi-dimensional
space is given by

$$P_\epsilon(\mathbf{R}) = \lambda\psi^2(\mathbf{R})/w(E_L(\mathbf{R})). \qquad (10)$$






Sampling and estimation using this distribution is 
straightforward to implement in standard Monte Carlo 
algorithms by using the new distribution at each
Metropolis step, and by including $w(E)$ when evaluating
estimates of expectation values.
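
The two changes described above (a modified Metropolis target and weighted sums) can be sketched for a toy problem. The example below is a hypothetical one-dimensional harmonic oscillator with trial wavefunction $\psi = \exp(-a x^2)$, for which $E_L = a + (1/2 - 2a^2)x^2$ and the exact variational energy is $a/2 + 1/(8a)$. It is not the carbon atom calculation of this paper: the system, the value of `A`, the step size and the function names are assumptions made for illustration, and the weight is taken to be $\epsilon^2/((E - E_0)^2 + \epsilon^2)$.

```python
import math
import random

A = 0.4  # hypothetical variational parameter of the toy trial wavefunction


def psi_squared(x):
    """Square of the toy trial wavefunction exp(-A x^2)."""
    return math.exp(-2.0 * A * x * x)


def local_energy(x):
    """Local energy of exp(-A x^2) for H = -(1/2) d^2/dx^2 + x^2 / 2."""
    return A + (0.5 - 2.0 * A * A) * x * x


def weight(energy, e0, eps):
    """Assumed residual sampling weight, eps^2 / ((E - E0)^2 + eps^2)."""
    return eps * eps / ((energy - e0) ** 2 + eps * eps)


def residual_sampling_energy(n_steps, e0, eps, step=1.5, seed=1):
    """Metropolis walk on psi^2 / w(E_L); returns sum(w E) / sum(w)."""
    rng = random.Random(seed)
    x = 0.0
    e_x = local_energy(x)
    p_x = psi_squared(x) / weight(e_x, e0, eps)  # unnormalised target
    sum_w = 0.0
    sum_we = 0.0
    for _ in range(n_steps):
        y = x + rng.uniform(-step, step)  # symmetric proposal
        e_y = local_energy(y)
        p_y = psi_squared(y) / weight(e_y, e0, eps)
        if rng.random() < p_y / p_x:  # Metropolis acceptance test
            x, e_x, p_x = y, e_y, p_y
        w = weight(e_x, e0, eps)
        sum_w += w  # denominator of the estimate
        sum_we += w * e_x  # numerator of the estimate
    return sum_we / sum_w


EXACT = A / 2.0 + 1.0 / (8.0 * A)
ESTIMATE = residual_sampling_energy(50000, e0=0.5, eps=0.2)
print(f"estimate = {ESTIMATE:.4f}, exact = {EXACT:.4f}")
```

Successive Metropolis samples are serially correlated, so in practice the sums would be formed from decorrelated samples or analysed by blocking before confidence limits are attached.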

Values are required for $(E_0, \epsilon)$ to define the sampling
strategy, but these only influence the distribution of random
errors in the estimate. Optimum values (in the sense of
resulting in the smallest random error) exist and may be
sought for a given calculation, but roughly speaking a
good choice of $E_0$ can be expected to be an approximate
total energy, and a good choice of $\epsilon$ an estimate of the
accuracy of $E_0$.

Two limits exist. For $\epsilon \to \infty$ the sampling is as for
standard sampling. For $(E_0, \epsilon) \to (E_{tot}, 0)$ (with $E_{tot}$ the
expectation value of the trial wavefunction/Hamiltonian
combination) the sampling is perfect for the numerator
of the residual variance estimator, and all the statistical
error is in the denominator. For any error in $E_0$ and
any finite value of $\epsilon$ this sampling scheme is somewhere
between these two extremes, hence the numerator is sampled
more efficiently at the cost of introducing more error
in the denominator. Of course this sampling strategy is
only of interest if the estimate converges to the true value
for increasing sample size ($r$), has a controlled error, and
is insensitive to the values of the parameters $(E_0, \epsilon)$.

Now that the residual sampling strategy is defined, es- 
timates for the total energy and residual variance are 
considered. These are of interest in their own right, 
and from the point of view of wavefunction optimisation 
methods. The next two sections define these estimates, 
analyse their statistical properties, and obtain distribu- 
tions of the random error in the large r limit. In addi- 
tion numerical VMC calculations for an all-electron car- 
bon atom are performed using both standard and resid- 
ual sampling strategies, in order to demonstrate the new 
sampling strategy for a real system. 

It should be borne in mind that many statements 
about standard sampling are not true for a more gen- 
eral sampling method. An important example is that 
the residual variance that is to be estimated is not the 
second moment of the sampled quantity, and is unrelated 
to the error in the total energy estimate. 

II. TOTAL ENERGY ESTIMATES AND 
CONFIDENCE LIMITS 

The residual sampling estimate of the total energy
takes the form

$$A_r[E_{tot}] = \frac{\sum_{n=1}^{r} w(\mathsf{E}_n)\mathsf{E}_n}{\sum_{n=1}^{r} w(\mathsf{E}_n)}, \qquad P(E) = P_\epsilon(E). \qquad (11)$$

In the standard sampling limit $P(E)$ possesses $E^{-4}$
asymptotes [1], but for finite $\epsilon$ the $w(E)^{-1}$ term in Eq. (8)
results in $E^{-2}$ asymptotic tails.

In order to characterise the random error of this esti- 
mate, due consideration must be taken of the estimate 



being made up of a quotient of two random variables.
Although $w(\mathsf{E}_n)$ and $\mathsf{E}_n$ are causally related there is no
reason to expect this causal relationship to hold between
sums of these random variables, hence the numerator and
denominator are only partially correlated. This observation
provides the required route to describing the statistics.

Defining

$$(\mathsf{Y}_n, \mathsf{X}_n) = (w(\mathsf{E}_n)\mathsf{E}_n, w(\mathsf{E}_n)) \qquad (12)$$

provides a bivariate random variable with a PDF that is
non-zero only on a parametric curve. A normalised sum
of these IID random bivariates gives a new bivariate

$$(\mathsf{M}_2, \mathsf{M}_1) = \left( \frac{1}{r}\sum_{n=1}^{r} \mathsf{Y}_n, \frac{1}{r}\sum_{n=1}^{r} \mathsf{X}_n \right), \qquad (13)$$

with a PDF, $P_r(y, x)$, that can be derived using a
standard convolution/Fourier transform approach [2], and
limit theorems obtained. Note that $P_r(y, x)$ is not limited
to a parametric curve in the two dimensional space
unless r = 1. [l^ 

The total energy estimate is then provided by

$$A_r[E_{tot}] = \frac{\mathsf{M}_2}{\mathsf{M}_1}, \qquad (14)$$

and associated confidence limits must be obtained from 
the bivariate distribution of the numerator and denomi- 
nator in this expression. 

A. Distribution of total energy estimates 

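The distribution of errors in the estimate is most easily
arrived at by initially assuming that the bivariate CLT is
valid, and then proving that it is so. For a valid bivariate
CLT the random bivariate $(\mathsf{M}_2, \mathsf{M}_1)$ possesses the PDF

$$P_r(y, x) = \frac{r}{2\pi |C|^{1/2}} \exp\left[ -\frac{q^2}{2} \right] \qquad (15)$$

in the large $r$ limit. The function $q$ is defined in matrix
notation by

$$q^2 = r \begin{pmatrix} x - \mu_1 & y - \mu_2 \end{pmatrix} C^{-1} \begin{pmatrix} x - \mu_1 \\ y - \mu_2 \end{pmatrix}, \qquad (16)$$

where $(y, x)$ are the values taken by $(\mathsf{M}_2, \mathsf{M}_1)$,
$(\mu_2, \mu_1) = (E[wE], E[w])$, and $C$ is the covariance
matrix defined by the elements

$$c_{ij} = E[w^2 E^{i+j-2}] - E[wE^{i-1}] E[wE^{j-1}], \qquad (17)$$

with $i$ and $j \in \{1, 2\}$. This is the bivariate CLT.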

To demonstrate that the CLT is valid it is sufficient to
show that all of the co-moments of the original distribution
exist ^], or that

$$\nu^{m,n} = E[(wE)^m (w)^n] \qquad (18)$$

exists for all non-negative $m$ and $n$. Since the integrand
is finite for all $E$, and the asymptotes of $w(E)$ and the
sampling distribution are known, it follows that the inequality

$$|\nu^{m,n}| \le \int_{-\infty}^{\infty} |P_\epsilon w^{m+n} E^m| \, dE \le a \int_{-\infty}^{\infty} \frac{1}{1 + |E|^{2n+m+2}} \, dE \qquad (19)$$

is true for some finite $a$. Performing the integral explicitly
gives

$$|\nu^{m,n}| \le \frac{2\pi a}{2n+m+2} \csc\left( \frac{\pi}{2n+m+2} \right) \qquad (20)$$

and hence $\nu^{m,n}$ is finite for all non-negative $m$ and $n$.
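
The closed form used in the last step can be checked numerically. The sketch below compares a simple midpoint-rule quadrature of $\int_{-\infty}^{\infty} dE/(1 + |E|^k)$ with $(2\pi/k)\csc(\pi/k)$, where $k = 2n + m + 2 \ge 2$. The substitution $E = \tan\theta$, the number of quadrature points and the function names are choices made for this illustration only.

```python
import math


def tail_integral_numeric(k, n_points=100000):
    """Midpoint-rule estimate of the integral of 1 / (1 + |E|^k) over the
    real line, after the substitution E = tan(theta)."""
    h = math.pi / n_points
    total = 0.0
    for i in range(n_points):
        theta = -0.5 * math.pi + (i + 0.5) * h
        e = math.tan(theta)
        # dE = (1 + tan^2(theta)) d(theta)
        total += (1.0 + e * e) / (1.0 + abs(e) ** k)
    return total * h


def tail_integral_closed_form(k):
    """Closed form (2 pi / k) csc(pi / k), valid for k > 1."""
    return (2.0 * math.pi / k) / math.sin(math.pi / k)


for k in (2, 3, 4, 6):
    numeric = tail_integral_numeric(k)
    exact = tail_integral_closed_form(k)
    print(f"k = {k}: numeric = {numeric:.8f}, closed form = {exact:.8f}")
```

Since $k \ge 2$ for all non-negative $m$ and $n$, the cosecant is finite in every case, which is what guarantees the existence of all co-moments.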

This demonstrates that the bivariate CLT is valid, and 
in addition that asymptotic power law behaviour does 
not occur in the PDF of the random variable $(\mathsf{M}_2, \mathsf{M}_1)$
for finite r.[^ 

Now that the validity of the bivariate CLT is estab- 
lished, the distribution of the quotient of the two random 
variables must be considered in order to characterise the 
error in the total energy estimate. Two approaches to 
this problem suggest themselves. The most direct route 
is to extract the PDF of the quotient directly from the 
bivariate Normal distribution. An alternative approach 
is to define a 2-dimensional confidence region for the bi- 
variate distribution. Both are examined here, with the 
second proving to be the most appropriate. 

A PDF of the quotient is defined in terms of the bi- 
variate PDF via the standard formula[J] 

$$P_r(u) = -\int_{-\infty}^{0} x P_r(y = ux, x) \, dx + \int_{0}^{+\infty} x P_r(y = ux, x) \, dx. \qquad (21)$$



Evaluating this explicitly using Equations (15), (16), and (17), and
taking the large $r$ limit a second time gives

$$P_r(u) = \left( \frac{r}{2\pi} \right)^{1/2} \frac{(c_{11}\mu_2 - c_{12}\mu_1)u + (c_{22}\mu_1 - c_{12}\mu_2)}{(c_{11}u^2 - 2c_{12}u + c_{22})^{3/2}} \exp\left[ -\frac{r(\mu_2 - u\mu_1)^2}{2(c_{11}u^2 - 2c_{12}u + c_{22})} \right], \qquad (22)$$

hence the distribution of quotients is clearly not Normal
in the large $r$ limit, even though $P_r(y, x)$ does approach
a bivariate Normal distribution. However, the width of
this distribution scales as $r^{-1/2}$ in the same manner as
a Normal distribution, and for $(c_{11}, c_{12}, \mu_1) \to (0, 0, 1)$
this distribution of total energy estimates approaches a
Normal distribution with higher power co-moments becoming
undefined in the limit.

For the general covariance matrix the asymptotic behaviour
in $u$ is given by

$$\lim_{|u| \to \infty} P_r(u) = \left( \frac{r}{2\pi} \right)^{1/2} \frac{|c_{11}\mu_2 - c_{12}\mu_1|}{c_{11}^{3/2}} \exp\left[ -\frac{r\mu_1^2}{2c_{11}} \right] \frac{1}{u^2}, \qquad (23)$$

hence the distribution of total energy estimates possesses
neither a mean nor a variance. At first this seems like a
serious problem, but it turns out to be irrelevant for two
reasons.

FIG. 1: Confidence regions defined for a bivariate
Normal distribution, $P_r(y, x)$, in order to obtain confidence
intervals for ratios of the two associated random variables.
The grey ellipse follows a line of constant $P_r$, and the straight
lines enclose a 'wedge' that contains lines of gradient $y/x$ with
probability $\alpha$ (see main text).

The magnitude of the power law tails in Eq. (23) decreases
exponentially as the number of sample points increases,
which means that for any reasonable set of parameters
(and for a wide range of unreasonable parameters)
the chance of a sample point appearing in these
tails is vanishingly small. A typical numerical value for
the coefficient of $u^{-2}$ in the asymptotic form for calculations
actually carried out is ~ 10""*^*^. In addition the
weight, $w(E)$, falls within the closed interval $0 \le \mathsf{X}_n \le 1$,
and $\mathsf{Y}_n$ is also bounded, hence for finite sampling these
tails do not actually occur. In effect the deviation of the
finite $r$ distribution from the large $r$ limit conspires to
remove these undesirable tails.

The analytic form given above is not the most ele- 
gant approach to visualising the distribution of gradients. 
Confidence intervals for the estimate are more clearly de- 
fined directly from the bivariate normal distribution by 
generalising the one dimensional confidence interval to a 
two dimensional confidence region in the space of the bivariate
PDF. To achieve this the approach of Fieller^ is
adopted, and is best described geometrically (see Fig. 1).

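An ellipse of constant probability density is defined via
a new parameter $q_0$ and the equation

$$q_0^2(\alpha_{ellipse}) = r \begin{pmatrix} x - \mu_1 & y - \mu_2 \end{pmatrix} C^{-1} \begin{pmatrix} x - \mu_1 \\ y - \mu_2 \end{pmatrix}, \qquad (24)$$

which defines an elliptical probability region that contains
$(\mathsf{M}_2, \mathsf{M}_1)$ with probability $\alpha_{ellipse}$.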

A 'wedge' is then defined as the region between two
straight lines that pass through the origin and are tangential
to this ellipse of constant probability density. The
region contained inside this wedge then defines a second
confidence region, that contains $(\mathsf{M}_2, \mathsf{M}_1)$ with
probability $\alpha$. Fieller's theorem essentially provides $q_0$ as a
function of one variable, either $\alpha_{ellipse}$ (via Hotelling's
$T^2$-distribution in the large $r$ limit) or $\alpha$ (via
Student's t-distribution in the large $r$ limit). The second of
these, $q_0(\alpha)$, provides a confidence interval for the total
energy estimate from the confidence wedge, since a fraction
$\alpha$ of $(\mathsf{M}_2, \mathsf{M}_1)$ provide total energy estimates that
fall between the bounding lines of the wedge.

Solving for the gradient at the boundaries of the $\alpha$
confidence wedge gives

$$l_l < A_r[E_{tot}] < l_u \quad \text{with confidence } \alpha, \qquad (25)$$

where $l_{u,l}$ are the gradients of the wedge boundaries and are
given by

$$l_{u,l} = \frac{(r\mu_1\mu_2 - q_0^2 c_{12}) \pm \sqrt{(r\mu_1\mu_2 - q_0^2 c_{12})^2 - (r\mu_1^2 - q_0^2 c_{11})(r\mu_2^2 - q_0^2 c_{22})}}{r\mu_1^2 - q_0^2 c_{11}} \qquad (26)$$

and

$$q_0(\alpha) = \sqrt{2} \, \mathrm{erf}^{-1}(\alpha). \qquad (27)$$

For this confidence interval to be finite the ellipse must
not cross the $x = 0$ line since, if it does, the confidence
interval may be $A_r[E_{tot}] > l_u$, $A_r[E_{tot}] < l_l$, or even
$-\infty < A_r[E_{tot}] < \infty$ (these two cases are referred to as
'exclusive unbounded' and 'completely unbounded' respectively,
with the usual case 'bounded' Q). A check for
whether these 'unbounded boundaries' occur is straightforward
to implement, and is far from being satisfied for
systems of interest. In addition, finite sample size and
bounded samples ensure that the unbounded cases never
occur for the actual (finite $r$) distribution of errors.

The magnitude of the confidence interval scales as
$r^{-1/2}$. It is not immediately apparent what type of estimate
is provided by this quotient of sample means. It is
a statistical estimate, as in the limit of increasing $r$ it
approaches the true total energy; however, it is not an unbiased
estimate, as its distribution has no mean. In fact no
unbiased estimate of the quotient exists, since the mean
of a quotient of random variables is not equal to the quotient
of the means of the random variables. Equation (25)
provides a 'central' estimate, in that the probability that
a sampled estimate value is higher than the true total
energy is equal to the probability that a sampled estimate
value is less than the true total energy [sl].

B. Analysis of data

In this section calculated total energies and confidence
limits for an isolated all-electron carbon atom are considered,
both using standard sampling and residual sampling.

A numerical Multi-Configuration-Hartree-Fock calculation
was performed to generate a multideterminant
wavefunction consisting of 48 Slater determinants
(corresponding to 7 configuration state functions (CSF)) using
the ATSP2K code of Fischer et al.^. Further correlation
was introduced via an 83 parameter Jastrow factor Q,
and a 130 parameter backflow transformation Q. This
219 parameter trial wavefunction was optimised using a
standard variance minimisation method, resulting in
$E_{VMC} = -37.8344(2)$ a.u., compared with the 'exact' |ig|
result of -37.8450 a.u. Of those trial wavefunctions that
can practically be constructed and used in QMC this
may be considered to be accurate, and reproduces 93.2%
of the correlation energy at VMC level. Unless otherwise
stated the parameters $(E_0, \epsilon)$ are taken to be the
estimated total energy and variance of the local energy
taken from a small standard sampling calculation. This
choice is justified in what follows.

The analysis of the sampled local energies uses the formulae
derived above, with the expectation integrals replaced
by the normalised sums of samples that are the
standard unbiased estimates. The sampled estimate of
the quantity $x$ is denoted $\hat{x}$, and sample estimates of the
bivariate mean and covariance matrix were calculated.
The primary aim of analysing the data is to characterise
the statistics of the random error in sample estimates for
both residual and standard sampling. Generating 10^
local energy samples, breaking this set of samples into
subsets of various sizes and analysing each of the subsets
individually provides independent sample estimates for
the total energy and variance, and these are then analysed
as a set of samples from the underlying distribution,
$P_r$.

Within residual sampling the sample estimate of the
bivariate mean obtained from $r$ samples is

$$(\hat{\mu}_2, \hat{\mu}_1) = \left( \frac{1}{r}\sum_{n=1}^{r} w_n E_n, \frac{1}{r}\sum_{n=1}^{r} w_n \right), \qquad (28)$$

and the sample estimates of the covariance matrix elements
take the form

$$\hat{c}_{22} = \frac{1}{r-1}\sum_{n=1}^{r} (w_n E_n - \hat{\mu}_2)^2,$$
$$\hat{c}_{12} = \frac{1}{r-1}\sum_{n=1}^{r} (w_n E_n - \hat{\mu}_2)(w_n - \hat{\mu}_1),$$
$$\hat{c}_{11} = \frac{1}{r-1}\sum_{n=1}^{r} (w_n - \hat{\mu}_1)^2. \qquad (29)$$

These provide an estimated value of the total energy
and accompanying confidence limits

$$\hat{E}_{tot} = \frac{\hat{\mu}_2}{\hat{\mu}_1}, \qquad (30)$$

and

$$\hat{l}_l < E_{tot} < \hat{l}_u \quad \text{with confidence } \alpha, \qquad (31)$$

with the limits given by

$$\hat{l}_{u,l} = \frac{(r\hat{\mu}_1\hat{\mu}_2 - q_0^2 \hat{c}_{12}) \pm \sqrt{(r\hat{\mu}_1\hat{\mu}_2 - q_0^2 \hat{c}_{12})^2 - (r\hat{\mu}_1^2 - q_0^2 \hat{c}_{11})(r\hat{\mu}_2^2 - q_0^2 \hat{c}_{22})}}{r\hat{\mu}_1^2 - q_0^2 \hat{c}_{11}} \qquad (32)$$

and $q_0$ a function of the required confidence interval via Eq. (27).

If required, further information on the deviation of this
distribution from the large $r$ limit is available from
statistical estimates of higher co-moments, a fundamentally
different situation to the standard sampling case.

Figure 2 shows estimates of the seed PDF, $P_{\psi^2}(E)$,
constructed from taking 10^ samples of the local energy,
binning these into intervals, and normalising ^ij. Estimates
are constructed from both standard sampling, for
which a weight of 1 per sample is binned into the chosen
energy intervals, and residual sampling, for which a
weight $w(\mathsf{E}_n)$ is binned. In addition the figure shows a
'model' distribution of the form

$$p(E) = \frac{\sqrt{2}}{\pi} \frac{\sigma^3}{\sigma^4 + (E - E_{tot})^4} \qquad (33)$$

with a mean and variance of $E_{tot}$ and $\sigma^2$, whose values
are obtained from the data using the usual unbiased
estimates. This is chosen as a simple analytic form that
reproduces the $E^{-4}$ asymptotic behaviour that has been
shown to be present in the seed distribution [1].

FIG. 2: The seed probability density function estimated by a
histogram of r = 10^ sampled local energies using standard
sampling (black) and residual sampling (grey), plotted against
$E - E_0$ (a.u.). These are results for an accurate all-electron
carbon trial wavefunction, as described in the text. Also shown
is the model distribution of Eq. (33) that reproduces the mean
and variance of the samples.

It is clear that residual sampling takes into account the
statistics of the local energy for large deviations from the
estimated total energy far more precisely than standard
sampling. The energy range of the figure is chosen to
show the breakdown of standard sampling, but for residual
sampling the estimated PDF shows the same
precision over an interval of around 1000 a.u. In addition
the expected $E^{-4}$ asymptotic behaviour (and agreement
with the model distribution) is reproduced by the
estimate over this range. This demonstrates a distinct
difference between the two approaches - standard sampling
does not sample the nodal surface and this results
in weak statistical convergence to the underlying PDF,
whereas residual sampling does sample the nodal surface
successfully, resulting in a faster statistical convergence
to the underlying PDF.

FIG. 3: Confidence limits for estimates of the total energy
for residual sampling, as a function of $(E_0, \epsilon)$. Data points
(with a fitted Pade form to guide the eye) are calculated for
$E_0$ taken as the standard sampling estimate of the total energy.
Grey curves are the confidence limits resulting from
the model distribution of Eq. (33), with $\Delta$ the positive
deviation from the exact VMC energy. The horizontal line at
$l_u - l_l = 0.001436$ a.u. is the standard sampling limit
corresponding to $\epsilon$ approaching infinity.

Residual sampling requires a choice of parameters to
specify the sampling PDF, $(E_0, \epsilon)$. Although the values
of these parameters influence only the statistics of
the random errors in estimates, it is important to examine
how variations in these parameters change the confidence
ranges for estimates. Figure 3 shows the estimates
of $l_u - l_l$ that result from the numerical calculations
as a function of $\epsilon$. Each datum was obtained using
r = 10^ samples, for a range of $\epsilon$ values, and for a fixed
$E_0 = -37.8344$ a.u., the standard sampling total energy
estimate for the trial wavefunction. The confidence range
possesses a well defined minimum for $\epsilon$ close to the standard
deviation of $P_{\psi^2}$, and for increasing $\epsilon$ approaches
the standard sampling limit. The optimum confidence
range (assumed to be at $\epsilon = \sigma$) is approximately 75% of
that resulting from standard sampling.
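
The confidence limits discussed above can be sketched in a few lines of Python. The function below forms the sample bivariate mean and covariance of $(w_n E_n, w_n)$ and then solves the Fieller quadratic for the two wedge gradients, using the large $r$ relation $q_0(\alpha) = \sqrt{2}\,\mathrm{erf}^{-1}(\alpha)$, evaluated here as the Normal quantile at $(1 + \alpha)/2$. The function name, the argument layout and the short data lists are hypothetical, and the samples are assumed to be independent.

```python
import math
from statistics import NormalDist


def fieller_limits(weights, energies, alpha=0.683):
    """Return (estimate, lower, upper) for sum(w E) / sum(w).

    Sketch only: assumes independent samples and the bounded case.
    """
    r = len(weights)
    y = [w * e for w, e in zip(weights, energies)]
    mu1 = sum(weights) / r  # sample mean of w
    mu2 = sum(y) / r  # sample mean of w * E
    c11 = sum((w - mu1) ** 2 for w in weights) / (r - 1)
    c22 = sum((v - mu2) ** 2 for v in y) / (r - 1)
    c12 = sum((v - mu2) * (w - mu1) for v, w in zip(y, weights)) / (r - 1)
    # sqrt(2) * erfinv(alpha) equals the Normal quantile at (1 + alpha) / 2.
    q0 = NormalDist().inv_cdf(0.5 * (1.0 + alpha))
    a = r * mu1 * mu1 - q0 * q0 * c11
    b = r * mu1 * mu2 - q0 * q0 * c12
    c = r * mu2 * mu2 - q0 * q0 * c22
    disc = b * b - a * c
    if a <= 0.0 or disc < 0.0:
        # The ellipse crosses x = 0: the interval is not bounded.
        raise ValueError("confidence wedge is unbounded")
    root = math.sqrt(disc)
    return mu2 / mu1, (b - root) / a, (b + root) / a


# Hypothetical data, for illustration only.
W = [0.5, 1.0, 0.8, 0.9, 0.7]
E = [1.0, 2.0, 3.0, 4.0, 5.0]
EST, LOW, HIGH = fieller_limits(W, E)
print(f"estimate = {EST:.4f}, limits = ({LOW:.4f}, {HIGH:.4f})")
```

When all weights are equal (the standard sampling limit) the covariance elements involving $w$ vanish and the limits reduce to the familiar mean plus or minus $q_0$ standard errors, which provides a simple consistency check.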

Also shown in the figure are the confidence ranges obtained
analytically for the model distribution of Eq. (33).
The figure shows the same general behaviour for the
model and actual distribution, with higher accuracy for
the actual results. The confidence range is shown as several
functions of $\epsilon$, with $E_0$ chosen to overestimate the
true mean value (known for the model distribution) by
an increasing amount, $\Delta$. The results show that for the
model system the presence of an improved confidence interval
is resilient to the deviations of the parameters $E_0$
and $\epsilon$ from their optimum values.

For the model distribution the optimum reduction in
the error relative to standard sampling is a factor of
0.765, which occurs for $(E_0, \epsilon) = (E_{tot}, \sigma)$ in the large
$r$ limit. The results suggest that an inaccurate estimate
of $E_{tot}$ can be used for $E_0$ (an accuracy of better than 0.5
a.u. should be sufficient), and that an order of magnitude
estimate of the variance of the local energy may be
used for $\epsilon$. Should this be insufficient it is always possible
to optimise the confidence interval itself with respect to
variations in $(E_0, \epsilon)$.
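
The factor of 0.765 can be reproduced with a short quadrature, sketched below under stated assumptions: the model distribution is taken as $p(E) = (\sqrt{2}/\pi)\,\sigma^3/(\sigma^4 + (E - E_{tot})^4)$, the weight as $\epsilon^2/((E - E_0)^2 + \epsilon^2)$, $E_0 = E_{tot} = 0$, and the large $r$ variance of the ratio estimate is approximated to leading order by $E[w^2 (E - E_{tot})^2]/(r\,E[w]^2)$, with expectations over the residual sampling distribution $p/w$. The function names and the quadrature scheme are illustrative choices.

```python
import math


def integrate_real_line(func, n_points=20000):
    """Midpoint rule over the real line via the substitution x = tan(theta)."""
    h = math.pi / n_points
    total = 0.0
    for i in range(n_points):
        theta = -0.5 * math.pi + (i + 0.5) * h
        x = math.tan(theta)
        total += func(x) * (1.0 + x * x)  # dx = (1 + x^2) d(theta)
    return total * h


def model_pdf(x, sigma):
    """Assumed model distribution with zero mean and variance sigma^2."""
    return (math.sqrt(2.0) / math.pi) * sigma ** 3 / (sigma ** 4 + x ** 4)


def weight(x, eps):
    """Assumed residual sampling weight with E0 = 0."""
    return eps * eps / (x * x + eps * eps)


def error_ratio(eps, sigma=1.0):
    """Large-r standard error of the residual sampling energy estimate,
    relative to standard sampling, for E0 equal to the true mean."""
    # Normalisation of the sampling distribution p / w.
    norm = integrate_real_line(lambda x: model_pdf(x, sigma) / weight(x, eps))
    # Integral of w x^2 p, proportional to E[w^2 x^2] under p / w.
    num = integrate_real_line(
        lambda x: weight(x, eps) * x * x * model_pdf(x, sigma)
    )
    return math.sqrt(num * norm) / sigma


for eps in (0.5, 1.0, 2.0):
    print(f"eps/sigma = {eps}: error ratio = {error_ratio(eps):.4f}")
```

For $\epsilon = \sigma$ the two integrals can also be evaluated in closed form, giving a ratio of $\sqrt{2 - \sqrt{2}} \approx 0.765$, in agreement with the value quoted above; values of $\epsilon$ a factor of two either side of $\sigma$ give a slightly larger ratio.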

In calculating the confidence intervals in Fig. 3 it is
implicitly assumed that the large $r$ limit has effectively
been reached. It is desirable to convincingly show that
this is in fact the case for the example calculation considered
here. First a 'big' estimate of the bivariate mean
and covariance matrix is constructed from the 10^ sample
local energies. Then this set of local energy samples
is separated into 10^ blocks of 10"* samples, and 10^ estimates
of the bivariate mean are constructed from these
blocks of data.

FIG. 4: This figure shows the statistics of estimated values of
the total energy. Scattered points are 100 estimated values
of the means whose quotient provides total energy estimates.
The ellipse is the estimated confidence ellipse, and the two
straight lines enclose the estimated confidence wedge described
in the main body of the text. For a valid bivariate CLT, 68.3%
of estimates fall within the confidence wedge.

Figure 4 shows the confidence ellipse and confidence
wedge of the r = 10^ estimates predicted using the 'big'
estimate of the bivariate mean and covariance matrix.
In addition the 10^ $(\hat{\mu}_2, \hat{\mu}_1)$ estimates are also scattered
over the figure. Of the sampled bivariates, 62 fall within
the 68.3% confidence wedge, in good agreement with the
bivariate CLT, and no suspicious outliers occur. It should
be noted that a linear combination of the means is plotted
on the vertical axis of the figure to make the finite width
of the confidence wedge visible - otherwise the correlation
between the sample means dominates and all samples
appear to fall on a line passing through the origin and
with a gradient given by the total energy.

This data supports the suitability of the residual sam- 
pling strategy, bivariate CLT, and the accompanying in- 
terpretation of error. 

Finally, an estimate of the PDF for total energy estimates
is constructed from the numerical data, for both
standard and residual sampling. Dividing the 10^ samples
into 10^ blocks of 10^ samples provides 10^ sample
estimates of the total energy in each case. A kernel
estimate [T]| of the distribution of total energy estimates
is then constructed using

$$P_r(E) = \frac{1}{mh} \sum_{i=1}^{m} \Theta\left( \frac{E - A_r[E_{tot}]_i}{h} \right), \qquad (34)$$

where the kernel, $\Theta$, was chosen to be a centred top-hat
function of width 1, m = 10'^ is the number of estimates,
and $h$ is the width parameter, chosen heuristically to
provide the clearest plot.

The estimated $P_r(E)$ for standard and residual sampling
is shown in Fig. 5. Although $E^{-4}$ asymptotic tails
are known to be present in the distribution for standard
sampling total energy estimates, for this particular calculation
they are not significant at an achievable statistical
resolution. There is no guarantee that this will be the
case for other calculations. For residual sampling the
bivariate CLT is valid in its strongest form, hence such
persistent leptokurtic tails are guaranteed to be absent.

Assuming the large $r$ limit has been reached, it is apparent
that residual sampling provides an improved confidence
interval (~ 75% of the standard sampling interval),
with an estimated total energy of -37.8344(23) a.u. for
standard sampling, and -37.8346(16) a.u. for residual sampling.
To put this another way, residual sampling requires
approximately half as many samples as standard
sampling to achieve a given accuracy.

FIG. 5: Estimated PDFs for total energy estimates constructed
from different sampling strategies. The unfilled solid
curve is for standard sampling, and the grey filled curve for
residual sampling. In both cases, a kernel estimate of the
PDF was constructed from 10^ total energy estimates, with
each total energy estimate constructed from r = 10^ samples.

III. RESIDUAL VARIANCE ESTIMATES AND
CONFIDENCE LIMITS

The residual variance, $V_{\epsilon^2}$, is defined as the integral of
the square of the residual associated with the Schrodinger
equation,

$$V_{\epsilon^2} = \frac{\int [(\hat{H} - E_0)\psi] \cdot [(\hat{H} - E_0)\psi] \, d\mathbf{R}}{\int \psi^2 \, d\mathbf{R}}, \qquad (35)$$

where $E_0$ may be considered as a variational
parameter. In terms of expectation of functions over
the seed distribution of residual sampling, $P_\epsilon$, this takes
the form

$$V_{\epsilon^2} = \frac{E[w(E - E_0)^2]}{E[w]}. \qquad (36)$$

The parameter $E_0$ may be varied to minimise the residual
variance, or taken to be the total energy (the two are
equivalent if the expectations in the above equation are
not estimated).

For standard sampling the CLT is not valid for estimates
of the residual variance. This, together with
the importance of the residual variance in wavefunction
optimisation methods, makes the development of an improved
residual variance estimator desirable.

A. Distribution of residual variance estimates 

Taking $E_G$ to be the total energy gives Eq. (36) in the
form

\[ V_{S^2} = \frac{\mathbb{E}[w E^2]}{\mathbb{E}[w]} - \left( \frac{\mathbb{E}[w E]}{\mathbb{E}[w]} \right)^2, \tag{37} \]



with a statistical estimate of this quantity provided by 
replacing each expectation by a normalised sum of sam- 
ples. 

A rigorous treatment of the statistics of this estimate 
requires a generalisation of the bivariate analysis to the 
trivariate case using 

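\[ (A_n, B_n, C_n) = \left( w(E_n) E_n^2, \; w(E_n) E_n, \; w(E_n) \right), \tag{38} \]

and the accompanying unbiased estimates of the means
that form the partially correlated random trivariate,

\[ (M_2, M_1, M_0) = \left( \frac{1}{r} \sum_{n=1}^{r} A_n, \; \frac{1}{r} \sum_{n=1}^{r} B_n, \; \frac{1}{r} \sum_{n=1}^{r} C_n \right), \tag{39} \]

to provide the estimated residual variance as

\[ A_r[V_{S^2}] = \frac{M_2}{M_0} - \left( \frac{M_1}{M_0} \right)^2. \tag{40} \]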



Confidence intervals for this quantity may, in principle, 
be obtained by an analogous route to the bivariate case, 
by obtaining an (unbiased) estimate of a 3 x 3 covari- 
ance matrix and defining a confidence region in the 3- 
dimensional space to provide a trivariate CLT and an 
analogue of Fieller's theorem. This added complexity is 
not considered to be necessary here. 
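
As a minimal sketch of the plug-in estimate of Eq. (40), assuming the local energies and their weights are already available as plain Python sequences, the three sample means can be formed and combined directly. The function name and the toy data are illustrative only and are not taken from the paper.

```python
def residual_variance_trivariate(energies, weights):
    """Plug-in estimate M2/M0 - (M1/M0)**2 built from Eqs. (38)-(40)."""
    r = len(energies)
    if r == 0 or r != len(weights):
        raise ValueError("energies and weights must be non-empty and equal in length")
    # Sample means of A_n = w E^2, B_n = w E and C_n = w.
    m2 = sum(w * e * e for e, w in zip(energies, weights)) / r
    m1 = sum(w * e for e, w in zip(energies, weights)) / r
    m0 = sum(weights) / r
    return m2 / m0 - (m1 / m0) ** 2


# Hypothetical toy data: unit weights mimic standard sampling.
toy_energies = [1.0, 2.0, 3.0, 4.0]
toy_weights = [1.0, 1.0, 1.0, 1.0]
print(residual_variance_trivariate(toy_energies, toy_weights))
```

With unit weights the result reduces to the 1/r sample variance of the energies, and a common rescaling of all weights leaves the estimate unchanged because it is a function of ratios of means only.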

Instead, $E_G$ is interpreted as a variational parameter,
which results in an estimate of the residual variance that
takes a bivariate form, and that reproduces the standard
sampling estimate for $w = 1$ and finite r. A random bivariate
is defined as

\[ (Y_n, X_n) = \left( w(E_n) (E_n - E_G)^2, \; w(E_n) \right). \tag{41} \]

The associated bivariate

\[ (M_2, M_1) = \left( \frac{1}{r-1} \sum_{n=1}^{r} Y_n, \; \frac{1}{r} \sum_{n=1}^{r} X_n \right) \tag{42} \]

provides the random variables whose quotient is an estimate
of the residual variance

\[ A_r[V_{S^2}] = \frac{M_2}{M_1}. \tag{43} \]

The prefactor in the definition of $M_2$ ensures that the
above estimate is unbiased for the case of standard sampling.
As for total energy estimates, the bivariate CLT
is assumed to be valid in order to define the distribution
of $(M_2, M_1)$, and then shown to be valid.
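
The bivariate quotient can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's implementation: the prefactor of the numerator mean is assumed to be 1/(r-1), which is the choice that makes the estimate unbiased for unit weights, and the helper names are hypothetical.

```python
def weighted_mean_energy(energies, weights):
    """Ratio estimate of the total energy, used here as E_G."""
    return sum(w * e for e, w in zip(energies, weights)) / sum(weights)


def residual_variance_bivariate(energies, weights, e_g):
    """Quotient M2/M1 with M2 = sum(Y)/(r-1) and M1 = sum(X)/r.

    Y_n = w_n (E_n - E_G)^2 and X_n = w_n; the 1/(r-1) prefactor is an
    assumption made for this sketch.
    """
    r = len(energies)
    if r < 2 or r != len(weights):
        raise ValueError("need at least two samples and matching weights")
    m2 = sum(w * (e - e_g) ** 2 for e, w in zip(energies, weights)) / (r - 1)
    m1 = sum(weights) / r
    return m2 / m1


# Hypothetical toy data with unit weights (standard sampling).
toy_energies = [1.0, 2.0, 3.0, 4.0]
toy_weights = [1.0] * 4
e_g = weighted_mean_energy(toy_energies, toy_weights)
print(residual_variance_bivariate(toy_energies, toy_weights, e_g))
```

For unit weights and $E_G$ equal to the sample mean this reproduces the usual unbiased sample variance, which is the standard sampling limit referred to in the text.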

Provided the CLT is valid, the large r PDF takes the
form

\[ P_r(M_2, M_1) = \frac{1}{2\pi} \frac{r}{|C|^{1/2}} e^{-q^2/2}, \tag{44} \]

and

\[ q^2 = r \begin{pmatrix} M_2 - \mu_2 & M_1 - \mu_1 \end{pmatrix} C^{-1} \begin{pmatrix} M_2 - \mu_2 \\ M_1 - \mu_1 \end{pmatrix}. \tag{45} \]

The bivariate mean $(\mu_2, \mu_1)$ and covariance matrix, C,
are defined in terms of the supplementary variables
$(x_2, x_1) = (w (E - E_G)^2, w)$ by

\[ (\mu_2, \mu_1) = \left( \mathbb{E}[x_2], \mathbb{E}[x_1] \right), \tag{46} \]

and

\[ C_{ij} = \mathbb{E}[x_i x_j] - \mathbb{E}[x_i] \mathbb{E}[x_j], \tag{47} \]

for i and j in {1, 2}. This is the bivariate CLT.

To show that this CLT is valid it is sufficient to show
that all of the co-moments of the original distribution
exist. A general co-moment can be expressed in terms of
the weights and energies as

\[ \mathbb{E}[x_2^m x_1^n] = \mathbb{E}[w^{m+n} (E - E_G)^{2m}] \tag{48} \]
\[ \phantom{\mathbb{E}[x_2^m x_1^n]} = \sum_{k=0}^{2m} \binom{2m}{k} (-E_G)^{2m-k} \, \mathbb{E}[w^{m+n} E^k], \tag{49} \]

hence it is required to show that $\mathbb{E}[w^{m+n} E^k]$ is finite
for all m, n and $0 \le k \le 2m$ (this includes the co-moments
associated with $E_{tot}$). Noting that the integrand is
finite for all E, and possesses asymptotes proportional to
$|E|^{k-2-2(m+n)}$, provides the inequalities

\[ \mathbb{E}[w^{m+n} E^k] \le \int_{-\infty}^{\infty} \left| P_\epsilon w^{m+n} E^k \right| dE < a \int_{-\infty}^{\infty} \frac{1}{1 + |E|^{2-k+2(m+n)}} \, dE, \tag{50} \]

for some finite a, or that

\[ \mathbb{E}[w^{m+n} E^k] < a \, \frac{2\pi}{2-k+2(m+n)} \csc\left( \frac{\pi}{2-k+2(m+n)} \right). \tag{51} \]

This inequality is valid for all non-negative m, n and
$0 \le k \le 2m$, and hence all co-moments exist. It then
follows that the bivariate CLT is valid and no asymptotic
power law behaviour occurs in the PDF of $(M_2, M_1)$.
Converting this bivariate distribution into a description
of the statistics of the residual variance estimate proceeds
exactly as for the total energy estimates in the previous
section. All that differs is the definition of the bivariate
mean and the covariance matrix.

From this point on, and in all numerical results, we
choose $E_G = E_{tot}$, with $E_{tot}$ taken as the estimate of
the previous section. Any deviation of $E_G$ from the true
expectation value of the total energy of $\psi$ does not
invalidate the variational principle for which the residual
variance is of interest, but it should be borne in mind
that the relatively small random variation in $E_G$ is not
taken into account in this error analysis.
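
The closed form on the right-hand side of Eq. (51) rests on the standard integral of $1/(1+|E|^s)$ over the real line, which equals $(2\pi/s)\csc(\pi/s)$ for s > 1. The sketch below checks this numerically with a simple midpoint rule for the exponents $s = 2 - k + 2(m+n)$ that occur here. The function names and the choice of quadrature are illustrative assumptions, not part of the paper.

```python
import math


def tail_exponent(m, n, k):
    """Exponent s = 2 - k + 2(m + n) appearing in Eqs. (50) and (51)."""
    return 2 - k + 2 * (m + n)


def tail_integral_numeric(s, steps=20000):
    """Midpoint-rule value of the integral of 1/(1 + |E|**s) over the real line."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h        # t runs over (0, 1)
        e = t / (1.0 - t)        # maps (0, 1) onto (0, infinity)
        total += h / ((1.0 + e ** s) * (1.0 - t) ** 2)
    return 2.0 * total           # the integrand is even in E


def tail_integral_closed(s):
    """Closed form (2 pi / s) csc(pi / s), valid for s > 1."""
    return (2.0 * math.pi / s) / math.sin(math.pi / s)


for s in (2, 3, 4, 6):
    print(s, tail_integral_numeric(s), tail_integral_closed(s))
```

Since $k \le 2m$, the exponent never falls below 2, so the bound is finite for every co-moment considered.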



B. Analysis of data 

Returning to the all-electron carbon atom, a VMC es- 
timate of the residual variance is required. The same 
local energy samples used for the total energy estimates 
are used to construct the residual variance estimates. 

First a 'central' estimate of the total energy is con- 
structed, 



\[ E_{tot} = \frac{\sum_{n=1}^{r} w_n E_n}{\sum_{n=1}^{r} w_n}, \tag{52} \]

and this is used to construct an estimate of the mean
bivariate

\[ (M_2, M_1) = \left( \frac{1}{r-1} \sum_{n=1}^{r} w_n (E_n - E_{tot})^2, \; \frac{1}{r} \sum_{n=1}^{r} w_n \right), \tag{53} \]

and covariance matrix elements

\[
\begin{aligned}
C_{22} &= \frac{1}{r-1} \sum_{n=1}^{r} \left[ w_n (E_n - E_{tot})^2 - M_2 \right]^2, \\
C_{12} &= \frac{1}{r-1} \sum_{n=1}^{r} \left[ w_n (E_n - E_{tot})^2 - M_2 \right] \left[ w_n - M_1 \right], \\
C_{11} &= \frac{1}{r-1} \sum_{n=1}^{r} \left[ w_n - M_1 \right]^2.
\end{aligned} \tag{54}
\]

Equations (53) and (54) provide the sample estimate of the
residual variance as

\[ A_r[V_{S^2}] = \frac{M_2}{M_1}, \tag{55} \]

with

\[ l_l < V_{S^2} < l_u \quad \text{with confidence } \alpha, \tag{56} \]

and $l_{u/l}$ defined in terms of the new $(\mu_2, \mu_1)$ and C using
Fieller's theorem and Eq. (26). As before, further
information on the deviation of this distribution from
the large r limit is available from estimates of higher
moments.

FIG. 6: Confidence limits for estimates of the residual
variance for residual sampling, as a function of $(E_0, \epsilon)$.
Data points (with a fitted Padé form to guide the eye) are
calculated for $E_0$ taken as the standard sampling estimate of
the total energy. Grey curves are the confidence limits
resulting from the model distribution of Eq. (33), with $\Delta$
the positive deviation from the exact total energy. The
standard sampling limit for this quantity, which corresponds
to $\epsilon$ approaching infinity, is not defined.
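
For a quotient of two correlated, approximately Normal means, confidence limits of the kind quoted in Eq. (56) can be obtained from the textbook form of Fieller's theorem. The sketch below uses that textbook form; it is not a transcription of Eq. (26), which is not reproduced in this section and may be written differently. The function name and argument names are hypothetical, and the variances passed in are those of the sample means (the sample covariance matrix divided by r).

```python
import math


def fieller_interval(a, b, v_aa, v_ab, v_bb, z=1.0):
    """Fieller limits for the ratio of means a/b.

    a, b             : sample means of numerator and denominator
    v_aa, v_ab, v_bb : (co)variances of those sample means
    z                : Normal quantile (z = 1 gives a 68.3% interval)
    """
    # The limits are the roots in theta of
    # (a - theta b)^2 = z^2 (v_aa - 2 theta v_ab + theta^2 v_bb).
    qa = b * b - z * z * v_bb
    qb = a * b - z * z * v_ab
    qc = a * a - z * z * v_aa
    if qa <= 0.0:
        raise ValueError("denominator mean is not significantly different from zero")
    disc = qb * qb - qa * qc
    if disc < 0.0:
        raise ValueError("no real confidence limits")
    root = math.sqrt(disc)
    return (qb - root) / qa, (qb + root) / qa


# Hypothetical numbers: an exactly known denominator reduces the
# interval to (a -/+ z * sqrt(v_aa)) / b.
print(fieller_interval(2.0, 4.0, 0.04, 0.0, 0.0))
```

In the (denominator, numerator) plane the two limits are the gradients of two straight lines through the origin, which is the confidence wedge picture used for the scatter plots in this paper.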

Results for the all-electron carbon atom are now considered
in the same manner as for the total energy estimates of the
previous section, and for the same reason. Beginning with the
influence of the sampling parameters, $(E_0, \epsilon)$, on the
statistical error, Fig. 6 shows estimates of $l_u - l_l$ that
result from the numerical calculation for a range of values
of $\epsilon$. Each datum was obtained using r = 10^ samples, and
for a fixed $E_0 = -37.8344$ a.u., the standard sampling total
energy estimate for the trial wavefunction. As for the total
energy estimate, the confidence range possesses a well defined
minimum for $\epsilon$ close to the standard deviation of $P_{\psi^2}$.
However, unlike the total energy estimate, this is not a
finite reduction of the CLT confidence range of standard
sampling, since for standard sampling the CLT confidence range
is not defined. In other words $l_u - l_l$ is unbounded as
$\epsilon$ increases, and no sample estimate of the standard
sampling confidence interval is shown as such a quantity does
not exist.

The figure also shows the confidence ranges resulting from
the model seed distribution (Eq. (33)), obtained analytically
and plotted as functions of $\epsilon$ for $E_0$ chosen to
overestimate the true mean value (known for the model
distribution) by $\Delta$. The analytic form shows no upper bound,
as expected, and suggests that the usefulness of the
confidence range is resilient to the deviations of the
parameters $E_0$ and $\epsilon$ from their optimum values. Given
that no 'standard sampling confidence range' exists, the case
for improved accuracy for residual sampling is stronger than
for the total energy estimate. Parameter values may be chosen
by the same criteria suggested for total energy estimates, or
by minimising the confidence interval itself.

FIG. 7: This figure shows the statistics of estimated values
of the residual variance. Scattered points are 100 estimated
values of the means whose quotient provides residual variance
estimates. The two straight lines enclose the estimated
confidence wedge described in the main body of the text. For a
valid bivariate CLT, 68.3% of estimates fall within the
confidence wedge.

To justify the validity of having reached the large r
limit with real numerical results, and the related validity
of the bivariate CLT, the 10^ sample local energies were used
to generate 10^2 estimates of the bivariate mean made up of
r = 10* samples each, and an estimate of the distribution that
these are sampled from. The quantity $E_G$ was defined as the
estimate of the total energy defined in section II, evaluated
separately for each block. Figure 7 shows a confidence wedge
predicted for the estimates constructed from the sample
covariance and mean taken from all the samples, and also shows
the 10^2 $(M_2, M_1)$ estimates scattered over the figure. Of
the sampled bivariates, 66 fall within the 68.3% confidence
wedge, in agreement with the bivariate CLT, and no suspicious
outliers occur. This also justifies the bivariate
interpretation of the residual variance estimate by showing
that the statistical variation in $E_G$ is not significant.
Note that the degree of correlation (although not complete)
prevents the confidence ellipse being visible. This data
supports the suitability of residual sampling, the bivariate
CLT, and the accompanying interpretation of error for
obtaining estimates of the residual variance. This is
fundamentally different to the standard sampling case, where
no CLT is valid and the statistical error is uncontrolled.

FIG. 8: Estimated PDFs for residual variance estimates
constructed from different sampling strategies. The unfilled
solid curve is for standard sampling, and the grey filled
curve for residual sampling. In both cases, a kernel estimate
of the PDF was constructed from 10'' residual variance
estimates, with each residual variance estimate constructed
from r = 10"' samples.

Finally, a kernel estimate of the PDF of the residual
variance estimate is constructed for both standard and
residual sampling in order to compare the distributions
of error that result in the two cases. The estimated PDFs
were constructed by dividing 10^ local energy samples
into 10^ blocks of 10^ samples, constructing a residual
variance estimate for each block (using a block by block
total energy estimate), and then constructing a kernel
estimate using Eq. (34). Figure 8 shows the resulting
estimated PDFs.
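
The block-and-kernel construction just described can be sketched as follows. This is a minimal illustration under stated assumptions: the per-block residual variance uses the simple plug-in form with the block's own weighted mean energy (no r/(r-1)-type correction), the kernel is a top-hat of unit width scaled by h, and all function names and toy numbers are hypothetical.

```python
def block_residual_variances(energies, weights, block_size):
    """Residual variance estimate for each consecutive block of samples.

    Each block uses its own weighted-mean energy; incomplete trailing
    blocks are discarded.
    """
    estimates = []
    for start in range(0, len(energies) - block_size + 1, block_size):
        e = energies[start:start + block_size]
        w = weights[start:start + block_size]
        w_sum = sum(w)
        e_block = sum(wi * ei for wi, ei in zip(w, e)) / w_sum
        spread = sum(wi * (ei - e_block) ** 2 for wi, ei in zip(w, e))
        estimates.append(spread / w_sum)
    return estimates


def top_hat_kernel_pdf(x, estimates, h):
    """Kernel estimate (1/(m h)) * sum_i K((x - est_i)/h), K = 1 for |u| < 1/2."""
    m = len(estimates)
    hits = sum(1 for est in estimates if abs((x - est) / h) < 0.5)
    return hits / (m * h)


# Hypothetical toy data: two blocks of four unit-weight samples.
toy_energies = [1.0, 2.0, 3.0, 4.0, 2.0, 2.0, 2.0, 2.0]
toy_weights = [1.0] * 8
blocks = block_residual_variances(toy_energies, toy_weights, 4)
print(blocks, top_hat_kernel_pdf(0.2, [0.0, 1.0], 1.0))
```

The width h trades resolution against noise; as in the text, it would be chosen heuristically for the clearest plot.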

The estimated standard sampling distribution clearly
demonstrates the invalidity of the CLT, the leptokurtic
tails, and the accompanying outliers predicted for standard
sampling in a previous paper [1]. The estimated residual
sampling distribution reflects the error analysis given
earlier in this section, providing numerical evidence that
the large r limit of the bivariate CLT has been reached.

On comparing the properties of the two distributions,
two main points suggest themselves. Due to the presence
of power law tails for standard sampling, it provides a
far wider distribution and is more vulnerable to outliers
than residual sampling. In addition, for increasing r, the
statistical spread of estimates scales as $r^{-1/3}$ [1] and
$r^{-1/2}$ for standard and residual sampling respectively,
hence standard sampling becomes even less accurate relative
to residual sampling as the number of samples increases.

Essentially this data tells us that the random error
in estimates of the residual variance is very different for
standard and residual sampling. The CLT fails for standard
sampling, but is reintroduced for residual sampling,
so residual sampling provides a confidence interval for the
residual variance, whereas standard sampling does not.
In addition the data suggest that the large r limit is
easily reached for practical sample sizes. The model seed
distribution of Eq. (33) and the numerical data for the
carbon atom suggest a standard sampling error one to
two orders of magnitude larger than for residual sampling
for r = 10"^, and this ratio increases as $r^{1/6}$.



IV. GENERAL SAMPLING AND MOMENTS 
OF SEED DISTRIBUTION 

The analysis given above has involved only a particular
sampling/weighting function combination, referred to
as residual sampling. A more general sampling function
is now considered in order to show how the presence of
$E^{-4}$ asymptotic behaviour in the 'standard' distribution
of local energies limits the quantities that may be
estimated, and the statistics of the random errors in those
quantities that can be estimated.

The influence of the chosen weighting/sampling functions
on the applicable limit theorems can be characterised
by its asymptotic behaviour, specifically by the
inverse power law behaviour of the weight function as
singularities in the local energy are approached. A large
E power law behaviour of $w \propto |E|^{-p}$ is taken for the
weight, and used to estimate the $q$th physical moment of
the seed distribution,

\[ \int P_{\psi^2} E^q \, dE. \tag{57} \]

The limit theorem valid for this moment will also be valid
for the expectation of any function of E that increases as
$E^q$ in the large $|E|$ limit.

The distribution of an estimate of this moment will
satisfy the CLT in its strongest form if all of the
co-moments for the sampling strategy characterised by $w(E)$
exist, that is if

\[ \nu_{m,n} = \mathbb{E}[(w E^q)^m (w)^n], \tag{58} \]

exists. This is the case if the inequalities
dE (59)



are satisfied for all non-negative m, n and some finite a.
The integral on the RHS is finite provided that

\[ n > 1 - \frac{3}{p} + m \left( \frac{q}{p} - 1 \right), \tag{60} \]



which is true for all non-negative m,n provided that 

p < 3 and q < p. (61) 

If this pair of inequalities is satisfied then the least
general version of the bivariate CLT (that provides the
strongest limits on the deviation from a Gaussian
distribution) is valid for the estimated moment. This is the
most desirable case, and precludes the presence of power
law tails for finite r. Note that this inequality
demonstrates that it is not possible to estimate 3rd moments
or higher of the $P_{\psi^2}$, which is not surprising given that
these integrals are not defined. The value of p is an
exclusive upper limit on the moments that can be estimated
with strong limits on their statistical error, and cannot
be greater than or equal to 3.

The most general version of the bivariate CLT that
provides no limit on the deviation from a Gaussian
PDF for finite r is the bivariate form of the Lindeberg
theorem. For this theorem to hold requires only the
1st and 2nd order co-moments of the estimate to exist,
resulting in the weaker limits

\[ p < 3 \text{ and } q < \frac{p+3}{2}. \tag{62} \]

So there is a small range of q values between the existence
of all moments and the complete invalidity of the
bivariate CLT where power law tails will persist into the
distribution of statistical errors. No integer q falls in this
region.
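
The existence conditions above can be explored numerically. The sketch below assumes, as in the preceding analysis, that the standard distribution of local energies decays as $|E|^{-4}$, that the weight decays as $|E|^{-p}$, and that the sampled distribution is the standard one divided by the weight; a co-moment then exists when its integrand decays faster than $1/|E|$, which for positive p is equivalent to Eq. (60). The function names and the truncation at a maximum order are illustrative assumptions.

```python
def tail_exponent(p, q, m, n, seed_tail=4.0):
    """Large-|E| exponent of P_w * (w E**q)**m * w**n.

    Assumes P_psi2 ~ |E|**(-seed_tail), w ~ |E|**(-p), P_w ~ P_psi2 / w.
    """
    return -(seed_tail - p) + (q - p) * m - p * n


def comoment_exists(p, q, m, n):
    """The co-moment integral converges if the integrand decays faster than 1/|E|."""
    return tail_exponent(p, q, m, n) < -1.0


def all_comoments_exist(p, q, max_order=20):
    """Check every co-moment with 0 <= m, n <= max_order (a finite proxy for 'all')."""
    return all(comoment_exists(p, q, m, n)
               for m in range(max_order + 1)
               for n in range(max_order + 1))


def second_order_comoments_exist(p, q):
    """Check only the co-moments with m + n <= 2, as needed for the weakest CLT."""
    return all(comoment_exists(p, q, m, n)
               for m in range(3) for n in range(3) if m + n <= 2)


for p, q in [(2, 1), (2, 2), (2, 2.25), (0, 1), (0, 2)]:
    print(p, q, all_comoments_exist(p, q), second_order_comoments_exist(p, q))
```

Within this sketch, p = 2 with q = 1 or 2 passes both checks, a non-integer q slightly above p passes only the second-order check, and unit weights (p = 0) give a weakly valid CLT for q = 1 and no valid CLT for q = 2, in line with the cases discussed in the text.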

The q, p values for which all moments exist tell us that
the CLT with the strongest limits on finite sample error
is valid for estimates of all the expectations that exist
for the trial wavefunction. Standard sampling does not
provide this ideal strategy of sampling and estimation,
and many of the expectations that exist have estimates
that either satisfy the CLT with the weakest limits on the
finite sample error, or do not satisfy the CLT. The case
p = 2 and q = 1, 2 corresponds to the total energy and
residual variance estimates for residual sampling given in
the previous two sections.

This analysis is limited to expectations that can be
expressed in terms of the local energy field variable. It is
possible to generalise the analysis given to estimates of
other quantities in VMC, since expectation values of
operators are generally formulated as expectations of field
variables (the local energy in the previous analysis) over
the physical PDF of the system (the $\psi^2$ in the above).
This can always be reformulated through a change of
random variables to provide the estimate as a mean of a
lower dimensional PDF.



V. OTHER ESTIMATES 

It has been shown [1] that for standard sampling the
CLT fails and the generalised central limit theorem takes
its place for a variety of estimates of physical quantities.
This is a direct consequence of singularities appearing in
the sampled field variable, and may be dealt with using
alternative sampling.

An ideal estimator would be one for which the 
strongest form of the CLT provides confidence intervals 



for the estimated quantity. Two complementary ap- 
proaches to creating such estimators naturally suggest 
themselves. A first method (essentially that described 
in the preceding sections for total energy and residual 
variance estimates) is to choose a new sampling strategy 
such that power law tails in the sampled quantities are 
removed. A second method is to construct an alternative 
estimator by adding terms to the sampled quantity that 
have a mean of zero, hence preserving the large sample 
size limit of the estimated quantity, but modifying the 
distribution of random error that occurs for finite sam- 
ple size. Both these approaches play a role in controlling 
the statistical error for general estimates. 

One of the most basic physical quantities for which
accurate estimates are required is the kinetic energy of
a system (the electronic kinetic energy for the examples
considered here). Estimates of this are straightforward to
construct in terms of a MC estimate of integrals.
Unfortunately, the integrand generally possesses singularities
on hyper-surfaces in 3N-dimensional space and so
uncontrolled random errors occur in the form of power law
tails in PDFs.

The most direct kinetic energy estimate is provided by
the operator in the Hamiltonian, and takes the form

\[ A_r[E_{ke}] = \frac{\sum_{n=1}^{r} w(E_n) K_n}{\sum_{n=1}^{r} w(E_n)}, \tag{63} \]

where $K_n = \left[ -\frac{1}{2} \psi^{-1} \sum_i \nabla_i^2 \psi \right]_{R_n}$ is a local kinetic
energy at a random sample point, $R_n$, in 3N-dimensional
space, and w = 1 corresponds to standard sampling. This local
kinetic energy possesses singularities for an electron
approaching a nucleus, for an electron approaching another
electron, and at the nodal surface, referred to as types 1, 2,
and 3 in [1] (this is true for any $\psi$ for which the Kato
cusp conditions are satisfied, and for which $\nabla^2 \psi \neq 0$ on
the nodal surface). For standard sampling, these
singularities remain present in the sampled quantity, and the
CLT is weakly valid in the sense that $x^{-4}$ asymptotic
tails are present in the PDF of the estimate for finite
sample size. For residual sampling, type 3 singularities
are removed, but types 1 and 2 remain, hence again the
CLT is weakly valid. In both cases the error is dominated
by the presence of singularities of types 1 and 2,
and these are unavoidable in the sense that they will be
present for the exact wavefunction.

Green's 1st theorem provides the means to remove the
type 1 and 2 singularities, giving a new estimate of the
form

\[ A_r[E_{ke}] = \frac{1}{2} \frac{\sum_{n=1}^{r} w(E_n) \left[ \sum_i \mathbf{F}_i \cdot \mathbf{F}_i \right]_{R_n}}{\sum_{n=1}^{r} w(E_n)}, \tag{64} \]

where the bracketed quantity is evaluated at the sample point
$R_n$, with the sum over all electrons, and $\mathbf{F}_i = \psi^{-1} \nabla_i \psi$
the drift velocity vector of electron i.

The distribution of the random error in the estimate
for both the standard and residual sampling case can be
obtained in the same way as for the total energy and
residual variance estimates considered previously. The
sole difference is in the order of the singularities present
in the averaged field variable. For standard sampling the
analysis shows that the sum of random variables that
make up the estimate does not obey the CLT, and an
infinite variance Stable distribution with $x^{-5/2}$ tails
results. For residual sampling, the summed random variables
are bounded, hence all co-moments exist, the bivariate CLT
is valid in its strongest form, and Fieller's theorem
provides a confidence interval for the estimated kinetic
energy. Figure 9 shows a kernel estimate of the PDF for
kinetic energy estimates of the same carbon trial
wavefunction described earlier. The figure explicitly shows
the failure of the CLT for standard sampling, and the
improved estimate resulting from residual sampling, both
using Eq. (64).

If we compare the results from the two different types
of estimator, Eq. (64) with residual sampling provides
$E_{ke}$ = 37.894(17) a.u., whereas Eq. (63) with standard
sampling gives $E_{ke}$ = 37.879(48) a.u. Standard sampling
requires eight times as many samples as residual
sampling to provide the same accuracy for kinetic energy
estimates, and, in addition, to obtain the confidence
intervals for standard sampling it must be assumed that
enough samples have been taken for the power law tails
to be unimportant.

FIG. 9: Estimated PDFs for kinetic energy estimates
constructed from different sampling strategies. The unfilled
solid curve is for standard sampling, and the grey filled
curve for residual sampling. In both cases, a kernel estimate
of the PDF was constructed from 10'^ kinetic energy
estimates, with each kinetic energy estimate constructed from
r = 10'^ samples.

Finally, we note that residual sampling can only handle
singularities at the nodal surface. For many estimates
a 'transfer' of singularities with types 1 and 2 to
the nodal surface may be achieved using 'zero-variance,
zero-bias' corrections of the form described by Assaraf
and Caffarel[ll[l3. However, there may be quantities
for which estimates that possess no type 1 or type 2
singularities are unavailable. Estimates of such quantities
may still be constructed using a more general sampling
strategy defined by the estimator

\[ \frac{1}{2} \frac{\sum_{n=1}^{r} w(G_n) G_n}{\sum_{n=1}^{r} w(G_n)}, \tag{65} \]

where $G_n = \left[ \psi^{-1} \hat{G} \psi \right]_{R_n}$. The sampling strategy would
be defined by choosing w to be a function of $G_n$ that
ensures that the summands are bounded, all co-moments
exist, and so the strongest limit theorems apply.



VI. CONCLUSION 

Previously it has been shown that the statistical errors
in the estimates of the two most important basic quantities
of variational QMC, provided by the most common 'standard
sampling' implementation of the method, are uncontrolled.
This results in the presence of unexpected outliers
in estimates, and the failure of the CLT. Here a more
general sampling strategy is used, referred to as 'residual
sampling'. Residual sampling prevents the artificial
introduction of singularities in the sampled quantities that
is an inherent part of the standard sampling strategy, and
the accompanying statistical difficulties. The new
sampling strategy reintroduces the CLT for the total energy
and residual variance in a strong form such that the
deviation of the distribution from Normal for finite sample
size is known and is bounded.

The 'cost' of residual sampling is that the local energy 
must be evaluated in order to generate sample points 
with the required distribution, increasing computational 
expense, and that the interpretation of the random error 
in estimates is more complicated as the estimate must 
be considered as a quotient of two correlated random 
variables, rather than a single random variable. 

The price of computational cost and complexity may
be justifiable for estimating the total energy. Numerical
results for an isolated all-electron carbon atom suggest
that residual sampling provides a modest improvement
in the error of the estimated total energy, since for this
case leptokurtic power law tails are weak for achievable
sample sizes. However, it should be borne in mind that these
tails may be stronger for other systems, cannot be
accurately (that is without bias) estimated, and are
completely removed by residual sampling.

The increase in cost and complexity is justifiable for
estimating the residual variance, since residual sampling
provides a qualitative as well as quantitative improvement
to estimates. The analysis and numerical data
clearly show that residual sampling provides a controlled
and small random error, unlike the standard sampling
case. This approach to controlling the random error in
estimates is also expected to be important for other
physical quantities - the CLT has been shown to be invalid
for several estimates [1] and residual sampling, or a
variant of residual sampling, provides a natural approach to
achieving a Normal distribution of random error.

A primary application of the sampling strategies
described is expected to be the optimisation of trial
wavefunctions. A considerably smaller number of samples is
expected to be required to obtain an accurate minimum,
since the random error of the optimised quantity is not
Normal for standard sampling but is described by a
bivariate Normal distribution for residual sampling.
Residual sampling also does not require the introduction of
ad hoc stabilisation methods, such as weight limiting [17]. A
further feature of the new sampling strategy is that it
samples the trial wavefunction close to the nodal surface
- the standard sampling method avoids sampling here -
the region where the accuracy of the trial wavefunction
influences the accuracy of subsequent Diffusion Monte
Carlo (DMC) calculations.

An analysis of the statistical errors of estimated quan- 
tities in VMC has not previously been available in the 



literature. An assumption of a valid CLT has repeatedly 
been relied upon to justify methods and results, for both 
the estimation of physical quantities and optimisation of 
trial wavefunctions. The analysis and residual sampling 
approach described here provide a method for predicting 
the random errors in QMC, and designing new sampling 
strategies that control and reduce the random error. It 
also provides the possibility of preferentially optimising 
a trial wavefunction in the region of the nodal surface, 
and so providing a new means to control the fixed node 
error of DMC methods. 